-
-
Notifications
You must be signed in to change notification settings - Fork 31k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-99108: Release the GIL around hashlib built-in computation #104675
Conversation
This matches the GIL releasing behavior of our existing `_hashopenssl` module, extending it to the HACL* built-ins.
@msprotz - does this make sense to you? |
Do you have a reference for how the Python GC works? I'm reading https://wiki.python.org/moin/GlobalInterpreterLock, https://docs.python.org/3/c-api/init.html#thread-state-and-the-global-interpreter-lock, and https://docs.python.org/3/c-api/memory.html -- is there any other good resource? Basically I'd like to understand if:
I'd like to read up a little bit before giving you a thumbs-up. Thanks! |
CPython is purely references counted with no compaction or moving of objects in memory. The GC exists solely to deal with reference cycles. It never rearranges memory and never frees memory behind any object with a non-zero reference count. We've got a writeup on that at https://devguide.python.org/internals/garbage-collector/. It has been this way since Python 2.0 when the cyclic GC was introduced. (before that, reference cycles were memory leaks) No memory returned from Python C APIs will ever be moved or freed so long as it belongs to referenced objects. The PyBytes objects the hash functions receive always have positive refcounts by definition, thus the This PR applies identical logic to what |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok that makes a lock more sense and now I understand the purpose of the lock. Thanks for the pointers!
Overall I think it would be helpful to document the intent behind this locking behavior, notably:
- what you said about avoiding locking up the CPU in case there's lots of data to hash
- the defensive lock that isn't strictly required but helps guard against rogue C API clients
- the lazy initialization and the fact that the module can't assume obj->lock is non-NULL (may be uninitialized or, as I understand it, lock creation may have failed).
Other than that, as far as I can tell, this looks fine. Thanks!
Agreed, I'll work on some common code comments to apply to everywhere relevant about these patterns. |
Thanks @gpshead for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12. |
…ythonGH-104675) This matches the GIL releasing behavior of our existing `_hashopenssl` module, extending it to the HACL* built-ins. Includes adding comments to better describe the ENTER/LEAVE macros purpose and explain the lock strategy in both existing and new code. (cherry picked from commit 2e5d8a9) Co-authored-by: Gregory P. Smith <greg@krypto.org>
GH-104776 is a backport of this pull request to the 3.12 branch. |
…H-104675) (#104776) gh-99108: Release the GIL around hashlib built-in computation (GH-104675) This matches the GIL releasing behavior of our existing `_hashopenssl` module, extending it to the HACL* built-ins. Includes adding comments to better describe the ENTER/LEAVE macros purpose and explain the lock strategy in both existing and new code. (cherry picked from commit 2e5d8a9) Co-authored-by: Gregory P. Smith [Google] <greg@krypto.org>
This matches the GIL releasing behavior of our existing
_hashopenssl
module, extending it to the HACL* built-ins.